Art Unit: 2183

Atty Docket: OT2.P59

## **IN THE CLAIMS**

1. (currently amended) A method for sharing a subinstruction of a given <u>VLIW</u> instruction among functional processing units of a plurality of clusters on a processor having a very long instruction word architecture, the given <u>VLIW</u> instruction including a set of control bits and at least one subinstruction, the processor comprising the plurality of clusters, each one cluster of the plurality of clusters comprising a plurality of functional processing units, the plurality of functional processing units executing the given <u>VLIW</u> instruction, the method comprising the steps of:

testing the set of control bits of the given VLIW instruction to identify a prescribed condition for sharing a subinstruction within the given VLIW instruction among multiple functional processing units;

when the prescribed condition is identified as a first prescribed condition, routing said shared subinstruction of the given VLIW instruction to multiple functional processing units as determined by the first prescribed condition;

when the prescribed condition is identified as a second prescribed condition, routing said shared subinstruction of the given VLIW instruction to multiple functional processing units as determined by the second prescribed condition, wherein the routing for the second prescribed condition is different than for the first prescribed condition; and

concurrently executing the subinstruction at said multiple functional processing units.

2. (currently amended) The method of claim 1, in which the step of routing comprises routing said <u>shared</u> subinstruction of the given <u>VLIW</u> instruction to a first functional processing unit of a first cluster of the plurality of clusters and to a first functional processing unit of a second cluster of the plurality of clusters; and in which the step of executing comprises concurrently executing the shared subinstruction at said first functional processing unit of the first cluster of the plurality of clusters and at the first functional processing unit of the second cluster of the plurality of clusters.

3. (currently amended) The method of claim 2, in which the given <u>VLIW</u> instruction comprises a first subinstruction and a second subinstruction, the step of testing comprising testing the set of control bits to identify a first prescribed condition, the step of routing comprising routing the first subinstruction, the method further comprising the steps of:

testing the set of control bits to identify a second prescribed condition;

when the second prescribed condition is identified, routing said second subinstruction of the given instruction to a second functional processing unit of the first cluster of the plurality of clusters and to a second functional processing unit of the second cluster of the plurality of clusters; and

concurrently executing the subinstruction at the first functional processing unit and the second functional processing unit; and

wherein the step of executing comprises concurrently executing the first subinstruction at the first functional processing unit of the first cluster, the first subinstruction at the first functional processing unit of the second cluster, the second subinstruction at the second functional processing unit of the first cluster and the second subinstruction at the second functional processing unit of the second cluster.

4. (currently amended) A method for storing <u>a VLIW</u> instruction of a computer program to be executed on a processor having a very long instruction word architecture,

wherein each <u>VLIW</u> instruction comprises at least one subinstruction and up to a first prescribed number of subinstructions, the first prescribed number being at least two,

wherein the processor is organized into a plurality of clusters equaling a second prescribed number, each one cluster of the plurality of clusters comprising a common number of functional processing units, wherein the common number of functional processing units times the second prescribed number equals the first prescribed number,

wherein for a given instruction having the first prescribed number of subinstructions, each functional processing unit of the plurality of clusters is for executing a respective subinstruction of the given instruction, the method comprising, during compilation of the computer program, the steps of:

identifying a pattern in which a subinstruction occurs more than once in the given instruction, said subinstruction being a redundant subinstruction;

determining whether the pattern is among a set of prescribed patterns;

when the pattern is among the set of prescribed patterns, setting a set of control bits for the instruction to indicate that said pattern is present.

5. (currently amended) The method of claim 4, further comprising, during compilation of the computer program, compressing the given instruction when the pattern is among the set of prescribed patterns by deleting one occurrence and leaving unchanged another occurrence of the redundant subinstruction in the given instruction to achieve a compressed-length VLIW instruction; and storing the compressed-length VLIW instruction as part of the compiled computer program.
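The compile-time steps of claims 4 and 5 (identify a redundant subinstruction, check the pattern against a prescribed set, set control bits, delete one occurrence) can be sketched as follows. The slot ordering, the `PRESCRIBED` table, its bit values, and the choice to OR the bits together when more than one pattern matches are all assumptions made for illustration; the claims do not specify them.

```python
# Assumed slot order for a 2-cluster, 2-unit machine:
# [cluster0/unit0, cluster0/unit1, cluster1/unit0, cluster1/unit1]
# Hypothetical prescribed patterns: (kept slot, redundant slot) -> control bit.
PRESCRIBED = {
    (0, 2): 0b01,  # unit 0 of both clusters carries the same subinstruction
    (1, 3): 0b10,  # unit 1 of both clusters carries the same subinstruction
}


def compress(slots):
    """Return (control_bits, compressed_slots) for one uncompressed instruction."""
    control_bits = 0
    dropped = set()
    for (keep, dup), bit in PRESCRIBED.items():
        if slots[keep] == slots[dup]:
            control_bits |= bit   # record that the pattern is present
            dropped.add(dup)      # delete one occurrence, keep the other
    compressed = [s for i, s in enumerate(slots) if i not in dropped]
    return control_bits, compressed


bits, packed = compress(["add", "mul", "add", "sub"])
```

A repetition that is not in the prescribed set leaves the instruction at full length with the control bits cleared, which corresponds to the "when the pattern is among the set" condition.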

6. (currently amended) The method of claim 5, further comprising, during run time of the computer program, the steps of:

moving the compressed-length VLIW instruction into an instruction cache;

instruction to determine a condition is identified in which subinstruction sharing is to occur for the compressed-length VLIW instruction;

when subinstruction sharing is determined to occur, parsing the compressed-length VLIW instruction to route the redundant subinstruction to a plurality of functional processing units as determined by the identified condition;

concurrently executing the subinstruction at said plurality of functional processing units.

7-12. (cancelled)

13. (currently amended) A computer system comprising:

a processor having a very large word instruction architecture and including a plurality of clusters of functional processing units, each one cluster of the plurality of clusters comprising a common number of functional processing units, the processor comprising a first prescribed number of clusters, said very large word instruction architecture allowing an instruction to have up to a second prescribed number of subinstructions, where the second prescribed number equals the first prescribed number times the common number, each instruction to be executed by the processor comprising from one subinstruction up to the second prescribed number of subinstructions, along with a set of control bits; and

an instruction cache memory which stores a first <u>VLIW</u> instruction in a compressed format determined by a condition of the set of control bits, the compressed format including a shared subinstruction stored in a given field of the first <u>VLIW</u> instruction which is to be shared by a plurality of the functional processing units, said plurality of functional processing units being determined by said condition of the set of control bits.

14. (original) The system of claim 13, in which said shared subinstruction is for a first functional processing unit of a first cluster and a first functional processing unit of a second cluster when the set of control bits identifies a first prescribed condition.

15. (original) The system of claim 14, in which the shared subinstruction is a first shared subinstruction, and in which the compressed format further includes a second shared subinstruction for a second functional processing unit of the first cluster and a second functional processing unit of the second cluster when the set of control bits either concurrently identifies a second prescribed condition.

16. (original) The system of claim 14, further comprising:

means for testing the set of control bits for a given instruction; and

means for routing said first common subinstruction to the first functional processing unit of the first cluster and to the first functional processing unit of the second cluster of the plurality of clusters when said testing means identifies the first prescribed condition.

17. (original) The system of claim 16, in which the first common subinstruction is concurrently executed at the first functional processing unit of the first cluster and the first functional processing unit of the second cluster.

18. (original) The system of claim 14, in which the first instruction in an uncompressed format includes the second prescribed number of subinstructions, the first instruction comprising a first subinstruction for being executed by a first functional processing unit of a first cluster and a second subinstruction for being executed by a first functional processing unit of a second cluster, the system further comprising means for compiling the first instruction, the compiling means comprising:

means for comparing the first subinstruction and the second subinstruction;

means for setting a state of the set of control bits to identify a first prescribed condition when the first subinstruction is equal to the second subinstruction.

19. (original) The system of claim 14, in which a first instruction in uncompressed format includes the second prescribed number of subinstructions, the first instruction comprising a first subinstruction for being executed by a first functional processing unit of a first cluster and a second subinstruction for being executed by a first functional processing unit of a second cluster, the system further comprising means for compressing the first instruction into the compressed format, the compressing means comprising:

means for testing the set of control bits associated with the first instruction;

means for reducing the size of the first instruction by omitting the second subinstruction when the set of control bits identifies that the first subinstruction equals the second subinstruction.

20. (original) The system of claim 14, in which a first instruction in uncompressed format includes the second prescribed number of subinstructions, the first instruction comprising a first subinstruction for being executed by a first functional processing unit of a first cluster and a second subinstruction for being executed by a first functional processing unit of a second cluster, the system further comprising means for caching the first instruction, the caching means comprising:

means for testing the set of control bits associated with the first instruction;

means for reducing the size of the first instruction to achieve a compressed format by omitting the second subinstruction when the set of control bits identifies that the first subinstruction equals the second subinstruction; and

means for loading the first instruction into the instruction cache in the compressed format.

21. (original) The system of claim 14, in which a first instruction in uncompressed format includes the second prescribed number of subinstructions, the first instruction comprising a first subinstruction for being executed by a first functional processing unit of a first cluster and a second subinstruction for being executed by a first functional processing unit of a second cluster, the system further comprising means for caching the first instruction, the caching means comprising:

means for comparing the first subinstruction and the second subinstruction;

means for setting a state of the set of control bits associated with the first instruction to identify a first prescribed condition when the first subinstruction is equal to the second subinstruction;

means for reducing the size of the first instruction to achieve a compressed format by omitting the second subinstruction when the set of control bits identifies that the first subinstruction equals the second subinstruction; and

means for loading the first instruction into the instruction cache in the compressed format.

22. (new) A method for processing a compressed-length VLIW instruction on a processor having a very long instruction word architecture, the compressed-length VLIW instruction including a set of control bits and at least one subinstruction, the processor comprising a plurality of clusters, each one cluster of the plurality of clusters comprising a plurality of functional processing units, the method comprising the steps of:

loading the compressed-length VLIW instruction into a cache;

determining distribution of the at least one subinstruction, wherein each one functional processing unit of the plurality of functional processing units of each cluster receives one of a no-operation subinstruction or a subinstruction expressly included among the at least one subinstruction;

wherein for a first prescribed condition of the set of control bits, at least one expressly included subinstruction is routed to multiple functional processing units as determined by the first prescribed condition;

wherein for a second prescribed condition of the set of control bits, at least one expressly included subinstruction is routed to multiple functional processing units as determined by the second prescribed condition, wherein the routing for the second prescribed condition is different than for the first prescribed condition; and

executing the compressed-length VLIW instruction as distributed among the plurality of functional units of each cluster by concurrently executing the subinstructions received at the plurality of functional processing units of each cluster.
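The distribution step of claim 22, in which every functional processing unit receives either a no-operation subinstruction or an expressly included subinstruction, might look like the sketch below. The `DISTRIBUTIONS` table, the `"nop"` marker, the bit values, and the four-slot layout (two clusters of two units) are hypothetical and chosen only to make the example concrete.

```python
NOP = "nop"  # assumed representation of a no-operation subinstruction

# Hypothetical per-condition layouts. Each entry names, for every slot in the
# order [c0/u0, c0/u1, c1/u0, c1/u1], the index of the expressly included
# subinstruction that slot receives, or None when the slot receives a NOP.
DISTRIBUTIONS = {
    0b01: [0, None, 0, None],  # assumed first prescribed condition
    0b10: [None, 0, None, 0],  # assumed second prescribed condition
}


def distribute(control_bits, subinstructions):
    """Expand a compressed-length instruction into one subinstruction per unit."""
    layout = DISTRIBUTIONS[control_bits]
    return [NOP if idx is None else subinstructions[idx] for idx in layout]


expanded_first = distribute(0b01, ["add"])
expanded_second = distribute(0b10, ["add"])
```

Here a single stored subinstruction is routed to two units under either condition, and the two conditions select different units; all four slots would then execute concurrently.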